# How to do retrieval with contextual compression

:::info Prerequisites

This guide assumes familiarity with the following concepts:

- [Retrievers](/docs/concepts/retrievers)
- [Retrieval-augmented generation (RAG)](/docs/tutorials/rag)

:::

One challenge with retrieval is that usually you don't know the specific queries your document storage system will face when you ingest data into the system. This means that the information most relevant to a query may be buried in a document with a lot of irrelevant text. Passing that full document through your application can lead to more expensive LLM calls and poorer responses.

Contextual compression is meant to fix this. The idea is simple: instead of immediately returning retrieved documents as-is, you can compress them using the context of the given query, so that only the relevant information is returned. “Compressing” here refers to both compressing the contents of an individual document and filtering out documents wholesale.

To use the Contextual Compression Retriever, you'll need:

- a base retriever
- a Document Compressor

The Contextual Compression Retriever passes queries to the base retriever, takes the initial documents and passes them through the Document Compressor. The Document Compressor takes a list of documents and shortens it by reducing the contents of documents or dropping documents altogether.

## Using a vanilla vector store retriever

Let's start by initializing a simple vector store retriever and storing the 2023 State of the Union speech (in chunks).
Given an example question, our retriever returns one or two relevant docs and a few irrelevant docs, and even the relevant docs have a lot of irrelevant information in them.
To extract all the context we can, we use an `LLMChainExtractor`, which will iterate over the initially returned documents and extract from each only the content that is relevant to the query.

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm install @langchain/openai @langchain/community @langchain/core
```

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/retrievers/contextual_compression.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>

## `EmbeddingsFilter`

Making an extra LLM call over each retrieved document is expensive and slow. The `EmbeddingsFilter` provides a cheaper and faster option by embedding the documents and query and only returning those documents which have sufficiently similar embeddings to the query.

This is most useful for non-vector store retrievers where we may not have control over the returned chunk size, or as part of a pipeline, outlined below.

Here's an example:

import EmbeddingsFilterExample from "@examples/retrievers/embeddings_filter.ts";

<CodeBlock language="typescript">{EmbeddingsFilterExample}</CodeBlock>

## Stringing compressors and document transformers together

Using the `DocumentCompressorPipeline` we can also easily combine multiple compressors in sequence. Along with compressors we can add BaseDocumentTransformers to our pipeline, which don't perform any contextual compression but simply perform some transformation on a set of documents.
For example `TextSplitters` can be used as document transformers to split documents into smaller pieces, and the `EmbeddingsFilter` can be used to filter out documents based on similarity of the individual chunks to the input query.

Below we create a compressor pipeline by first splitting raw webpage documents retrieved from the [Tavily web search API retriever](/docs/integrations/retrievers/tavily) into smaller chunks, then filtering based on relevance to the query.
The result is smaller chunks that are semantically similar to the input query.
This skips the need to add documents to a vector store to perform similarity search, which can be useful for one-off use cases:

import DocumentCompressorPipelineExample from "@examples/retrievers/document_compressor_pipeline.ts";

<CodeBlock language="typescript">{DocumentCompressorPipelineExample}</CodeBlock>

## Next steps

You've now learned a few ways to use contextual compression to remove bad data from your results.

See the individual sections for deeper dives on specific retrievers, the [broader tutorial on RAG](/docs/tutorials/rag), or this section to learn how to
[create your own custom retriever over any data source](/docs/how_to/custom_retriever/).
